Method and device for storing an image into a memory

ABSTRACT

Aspects of the disclosure provide a method and device for storing an input image into a memory. The disclosure describes allocating one or more frame buffers in the memory. The disclosure further describes dividing the input image into access units corresponding to subsets of the input image and allocating a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer. The disclosure also describes compressing the access units into compressed access units and storing each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then storing a remainder of the compressed access unit into its respective secondary portion.

INCORPORATION BY REFERENCE

This present disclosure claims the benefit of U.S. Provisional Application No. 62/409,416, “UFO Format” filed on Oct. 18, 2016, and U.S. Provisional Application No. 62/489,588, “Memory Access Efficiency Optimization for Frame Buffer Compression” filed on Apr. 25, 2017, which are incorporated herein by reference in their entirety.

BACKGROUND

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.

An electronic device, such as a computer system, can include one or more memories. In an example, the electronic device includes a component, such as a central processing unit (CPU) located on a different integrated circuit chip than the memory, that accesses the memory via a memory controller. Memory access by the CPU can generate heavy data traffic between the CPU and the memory.

SUMMARY

Aspects of the disclosure provide a method for storing an input image into a memory. The method can include allocating one or more frame buffers in the memory, and dividing the input image into access units corresponding to subsets of the input image and allocating a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer. The method can further include compressing the access units into compressed access units, and storing each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then storing a remainder of the compressed access unit into its respective secondary portion.

In embodiments, the memory can have a sequence of memory blocks separated by memory boundaries at addresses that are multiples of a memory block size and the memory block size is determined based on characteristics of the memory and devices accessing the memory. The size of the main portions can be one or multiples of the memory block size and starting addresses of the main portions are aligned to memory boundaries. Further, a size of the main portions can be a fraction of the memory block size and each main portion is located within a respective memory block. A size of the secondary portions can be a fraction of the memory block size and secondary portions are grouped into one or more secondary portion groups. The one or more secondary portion groups can have a size being one or multiples of the memory block size and the first secondary portion of each secondary portion group has a starting address aligned to a memory boundary.

In embodiments, a plurality of main portions and secondary portions can be arranged in a predetermined pattern to form a superblock having a size that is one or multiples of the memory block size. The main and secondary portions in the frame buffer can be arranged by sequentially positioning a plurality of superblocks adjacent to each other.

In embodiments of the disclosure, the memory block size can be selected to be 32, 64, 128, 256, 512, 1K, 2K, or 4K bytes.

Aspects of the disclosure can also provide a device for storing an input image. The device can include a memory configured to store one or more frame buffers, and a memory distribution device configured to receive the input image, allocate a frame buffer in the memory to store the input image, divide the input image into access units corresponding to subsets of the input image, and allocate a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer. The device can further include a memory controller configured to store, in response to instructions of the memory distribution device, each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then store a remainder of the compressed access unit into its respective secondary portion.

In exemplary embodiments, the memory can have a sequence of memory blocks separated by memory boundaries at addresses that are multiples of a memory block size. The memory block size can be determined based on characteristics of the memory and devices accessing the memory including the memory distribution device. The memory distribution device can be configured to select a size of the main portions to be one or multiples of the memory block size and align starting addresses of the main portions to memory boundaries. Further, the memory distribution device can be configured to select a size of the main portions to be a fraction of the memory block size and position each main portion within a respective memory block. The memory distribution device can be configured to select a size of the secondary portions to be a fraction of the memory block size and group secondary portions into one or more secondary portion groups. The memory distribution device can be configured to select a size of the one or more secondary portion groups to be one or multiples of the memory block size and align the starting address of the first secondary portion of each secondary portion group to a memory boundary.

In alternative embodiments, the memory distribution device can be configured to arrange a plurality of main portions and secondary portions in a predetermined pattern to form a superblock having a size being one or multiples of the memory block size. The superblocks can be positioned adjacent to each other in the frame buffer.

In embodiments, the memory can be located on a different integrated circuit chip than the memory distribution device. Further, the memory distribution device can be integrated into a video codec. Additionally, the memory distribution device can be configured to select the memory block size to be 32, 64, 128, 256, 512, 1K, 2K, or 4K bytes.

Alternative aspects of the disclosure can provide a non-transitory computer readable medium having computer readable instructions stored thereon which, when executed by a processing circuit, cause the processing circuit to perform a method. The method can include allocating one or more frame buffers in the memory, and dividing the input image into access units corresponding to subsets of the input image and allocating a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer. The method can further include compressing the access units into compressed access units, and storing each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then storing a remainder of the compressed access unit into its respective secondary portion.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments of this disclosure that are proposed as examples will be described in detail with reference to the following figures, wherein like numerals reference like elements, and wherein:

FIG. 1 shows an exemplary block diagram of a memory system 100 according to an embodiment of the disclosure;

FIG. 2 shows exemplary data structures 200 according to an embodiment of the disclosure;

FIG. 3 shows three exemplary superblocks 341A-341C in three frame buffers 331A-331C according to an embodiment of the disclosure;

FIG. 4 shows three exemplary superblocks 441A-441C in three frame buffers 431A-431C according to an embodiment of the disclosure;

FIG. 5 shows two exemplary superblocks 541A-541B in two frame buffers 531A-531B according to an embodiment of the disclosure;

FIG. 6 shows alternative frame buffer examples according to an embodiment of the disclosure; and

FIG. 7 shows a flow chart outlining an exemplary process 700 according to an embodiment of the disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS

FIG. 1 shows an exemplary block diagram of a memory system 100 according to an embodiment of the disclosure. As shown, the memory system 100 can include a memory distribution device 110, a memory controller 120, and a memory 130. The memory 130 can include a frame buffer 131. The memory system 100 is configured to divide an input image into one or more access units and to store each compressed access unit into a main portion and a secondary portion allocated in the frame buffer 131 for the respective access unit.

The memory system 100 can be any suitable system for storing data. In an embodiment, the memory system 100 is an electronic device, such as a desktop computer, a tablet computer, a smart phone, a wearable device, a smart TV, a camera, a camcorder, a media player, and the like. In an exemplary embodiment, the memory system 100 can further include other components accessing the data stored in the memory 130. For example, other components can include a central processing unit (CPU) 141, a graphics processing unit (GPU) 142, a multimedia engine 143, display circuitry 144, an image processor 145, a video codec 146, and the like.

In an embodiment, the memory 130 can have a sequence of memory blocks separated by memory boundaries, for example, at every 32, 64, 128, 256, 512, 1K, 2K, or 4K bytes based on page size or channel division. Accessing a certain amount of data stored within a memory block in between two adjacent boundaries is more efficient than accessing the same amount of data separately stored in two memory blocks across a memory boundary. As a result, data in the memory 130 can be accessed efficiently by another component of the memory system 100 when a starting address of the data is aligned to a memory boundary. Memory boundaries are formed at addresses that are multiples of a memory block size. In one embodiment, the memory block size can be a certain amount of data that can be rapidly transferred between the memory 130 and the other components of the memory system 100 in a sequence of burst read/write commands with a single or a few pre-charge and active commands. The memory block size can be selected based on characteristics of the memory 130 and other components of the memory system 100 accessing the memory 130, such as page size and channel division of the memory 130 as well as architecture and operation mode of the memory 130 and other components of the memory system 100 accessing the memory 130.
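The cost of crossing a memory boundary can be illustrated with a small calculation. The following Python sketch is illustrative only: the helper names `is_aligned` and `blocks_touched` are hypothetical and do not appear in the disclosure, and memory blocks are assumed to have a fixed size with the first boundary at address 0.

```python
def is_aligned(address: int, block_size: int) -> bool:
    """True when the address falls on a memory boundary (a multiple of block_size)."""
    return address % block_size == 0


def blocks_touched(start: int, length: int, block_size: int) -> int:
    """Number of memory blocks covered by the byte range [start, start + length)."""
    if length <= 0:
        return 0
    first_block = start // block_size
    last_block = (start + length - 1) // block_size
    return last_block - first_block + 1
```

With a 64-byte memory block size, 64 bytes starting at address 0 fall within a single memory block, whereas the same 64 bytes starting at address 32 straddle the boundary at 64 bytes and touch two memory blocks.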

The memory distribution device 110 is configured to receive and divide the input image into one or more access units. The memory distribution device 110 is further configured to allocate a portion of the memory 130, such as the frame buffer 131, to the input image and allocate to each access unit two memory portions, a main portion and a secondary portion, in the frame buffer 131. In an example, a starting address of the frame buffer 131 can be aligned to a memory boundary, such as 0 bytes. The memory distribution device 110 is configured to compress each access unit and store each compressed access unit into the respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then store a remainder part of the compressed access unit into its respective secondary portion. In an embodiment, the memory distribution device 110 can be integrated into any component accessing data stored in the memory 130, such as one or more components of the memory system 100 including the CPU 141, the GPU 142, the multimedia engine 143, the display circuitry 144, the image processor 145, and the video codec 146, and the like.

In an embodiment, the main portions can have starting addresses aligned to memory boundaries and sizes that are one or multiples of the memory block size. Accordingly, data stored in the main portions can be efficiently accessed. Alternatively, when the size of the main portion is smaller than the memory block size, each main portion can be located within the respective memory block while one or more main portions can have starting addresses aligned to one or more memory boundaries.

The size of the secondary portion can be a fraction of the memory block size. As a result, two or more secondary portions can be grouped together and stored separately from their respective main portions. When a size of a compressed access unit is less than or equal to that of the main portion, the compressed access unit can be completely stored inside the main portion, without using the secondary portion. As a result, accessing the compressed access unit can be done efficiently because the main portion can be accessed efficiently.

In another embodiment, at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer, including in a reversed order where the main portion has a larger address than that of its respective secondary portion.

The memory controller 120 is configured to manage memory access from the memory distribution device 110 to the memory 130. The memory controller 120 can be configured to receive requests from the memory distribution device 110 to store a compressed access unit to the respective main and secondary portion in the frame buffer 131 of the memory 130. Based on the requests, the memory controller 120 can transmit commands to the memory 130 with instructions to store the compressed access unit to the respective main and secondary portion in the frame buffer 131. The memory controller 120 can also be configured to schedule and buffer the requests and the like.

The memory 130 can be any suitable device for storing data. In an embodiment, the memory 130 includes a dynamic random access memory (DRAM) type memory module, such as double data rate synchronous DRAM (DDR SDRAM), double data rate two synchronous DRAM (DDR2 SDRAM), double data rate three synchronous DRAM (DDR3 SDRAM), double data rate four synchronous DRAM (DDR4 SDRAM), low power DDR SDRAM (LPDDR SDRAM), and the like.

In an embodiment, the memory system 100 can be a system-on-chip (SOC) where all the components are located on a single integrated circuit (IC) chip. Additionally, the other components such as the CPU 141, GPU 142, multimedia engine 143, display circuitry 144, image processor 145, and video codec 146 can also be included on the same single IC chip. Alternatively, the components in the memory system 100 can be distributed across several ICs. For example, the memory distribution device 110, the memory controller 120, the memory 130, and other components of the memory system 100 can be located on multiple IC chips. Further, the memory distribution device 110 can be integrated into any component that accesses data stored in the memory 130, such as one or more components of the memory system 100 including the CPU 141, the GPU 142, the multimedia engine 143, the display circuitry 144, the image processor 145, and the video codec 146, and the like.

During operation, an input image can be received by the memory distribution device 110. The memory distribution device 110 can divide the input image into one or more access units. Further, the memory distribution device 110 can allocate a portion of the memory 130, such as the frame buffer 131, to the input image. Two memory portions, a main portion and a secondary portion, are allocated for each access unit in the frame buffer 131. Under the instruction of the memory distribution device 110, the memory controller 120 can store compressed access units into their respective main portions and, depending on size, also their secondary portions. The main portions can have starting addresses aligned to memory boundaries and a size that can be one or multiples of the memory block size. The size of the secondary portion can be a fraction of the memory block size. As a result, two or more secondary portions can be grouped together and stored separately from their main portions. When the size of the compressed access unit is less than or equal to that of the main portion, the compressed access unit can be completely stored in the main portion without using the secondary portion. As a result, accessing the compressed access unit can be done efficiently.

FIG. 2 shows exemplary data structures 200 according to embodiments of the disclosure illustrating an input image 210 divided into access units, as well as frame buffers 231A and 231B. As shown, the input image 210 can be divided into an N×M array of access units. Within the array, the size of an access unit depends on the number of pixels in the access unit and the pixel bit-depth. The pixel bit-depth is the number of bits used to specify a color of a pixel, such as 10 or 12 bits corresponding to 1024 or 4096 colors, respectively. In an example, the number of pixels in the access unit can depend on a compression method used by the memory distribution device 110, such as the size of a compression unit on which the compression method operates. For example, the size of the compression unit can be 4×4, 8×8, 16×4, 16×8, 16×16 pixels, and the like. An access unit can have one or more compression units.
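The dependence of the access unit size on the pixel count and the pixel bit-depth can be sketched as follows. This is a minimal illustration under assumptions that the disclosure does not state: one sample per pixel, tight bit packing with no padding, and rounding up to a whole byte. The function name `access_unit_bytes` is hypothetical.

```python
def access_unit_bytes(width_px: int, height_px: int, bit_depth: int,
                      compression_units: int = 1) -> int:
    """Uncompressed size in bytes of an access unit.

    Assumes one sample per pixel and tight bit packing (illustrative
    assumptions only); the bit count is rounded up to a whole byte.
    """
    total_bits = width_px * height_px * bit_depth * compression_units
    return (total_bits + 7) // 8
```

Under these assumptions, an access unit holding one 16×4 compression unit at a 10-bit depth occupies 80 bytes, and one 16×8 compression unit at the same depth occupies 160 bytes.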

The frame buffer 231A illustrates an exemplary frame buffer structure for storing the input image. Frame buffers can be memories having addressable locations for storing data. The addressable locations within the memories can be grouped into memory blocks having a memory block size. As discussed above, the memory block size can be selected based on characteristics of the memory 130 and other components of the memory system 100 accessing the memory 130, such as page size and channel division of the memory 130 as well as architecture and operation mode of the memory 130 and other components of the memory system 100 accessing the memory 130. In an example, the memory block size can be selected to be 32, 64, 128, 256, 512, 1K, 2K, 4K bytes, or the like. For example, the memory 130 can be a DDR3 SDRAM device and data is retrieved from the memory 130. The memory block size is the amount of data that can be retrieved from the memory 130 in a single read cycle. In particular, the memory block size can be 32 bytes when the data bus width and burst length are 64-bit (8 bytes) and 4, respectively. In another example, the memory block size can be determined by the cache line size of the CPU 141 or the GPU 142 that accesses the memory, for example 64 bytes, 128 bytes, or the like. In the FIG. 2 example, the memory block size is 64 bytes, and thus the memory boundaries 250(1) to 250(n) are at addresses 0, 64, 128, 192 bytes, and so on in the frame buffers 231A and 231B.
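The arithmetic behind the burst example and the boundary addresses can be written out as a short sketch. The function names `burst_block_bytes` and `memory_boundaries` are hypothetical, and the first boundary is assumed to sit at address 0 as in the FIG. 2 example.

```python
from typing import List


def burst_block_bytes(bus_width_bits: int, burst_length: int) -> int:
    """Bytes transferred by one burst: bus width in bytes times burst length."""
    if bus_width_bits % 8 != 0:
        raise ValueError("bus width is assumed to be a whole number of bytes")
    return (bus_width_bits // 8) * burst_length


def memory_boundaries(block_size: int, count: int) -> List[int]:
    """First `count` memory boundaries: multiples of the block size from 0."""
    return [i * block_size for i in range(count)]
```

A 64-bit data bus with a burst length of 4 gives 8 × 4 = 32 bytes, and a 64-byte memory block size places the first four boundaries at 0, 64, 128, and 192 bytes.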

A main portion and a secondary portion can be allocated for each access unit. In an embodiment, a size of the main portions can depend on the compressibility of the input image, the compression method, the memory block size, and the like. In addition, in an embodiment, the summation of a size of the main portion and a size of the secondary portion can be equal to the size of the access unit. Similarly, a ratio of the size of the main portion over the size of the secondary portion can depend on the compressibility of the input image and the compression method, and the like. For example, when the access units can be compressed to smaller sizes, smaller main portions can be sufficient to store the compressed access units and the respective secondary portions can remain empty, also resulting in a smaller ratio of the size of the main portion over the size of the secondary portion. For example, the ratio of the size of the main portion over the size of the secondary portion can be 2, 4, 8, or the like.
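For the embodiment in which the main and secondary portion sizes sum to the access unit size, the two sizes follow from the access unit size and the ratio. The sketch below is illustrative; the function name `portion_sizes` is hypothetical, and it assumes an integer ratio that divides the access unit evenly.

```python
from typing import Tuple


def portion_sizes(access_unit_size: int, ratio: int) -> Tuple[int, int]:
    """Return (main_size, secondary_size) with main / secondary == ratio.

    Assumes main + secondary equals the access unit size, which holds
    only for the embodiment that states this relationship.
    """
    if ratio <= 0 or access_unit_size % (ratio + 1) != 0:
        raise ValueError("access unit size must divide evenly by ratio + 1")
    secondary = access_unit_size // (ratio + 1)
    main = access_unit_size - secondary
    return main, secondary
```

For instance, a 160-byte access unit with a ratio of 4 yields a 128-byte main portion and a 32-byte secondary portion, and a 192-byte access unit with a ratio of 2 yields 128 and 64 bytes.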

Further, the main portions can have starting addresses that are aligned to memory boundaries and sizes that are multiples of the memory block size so that the data stored in the main portions can be efficiently accessed. In the example shown in FIG. 2, the size of the main portions 221(1)-221(3) can be selected to have a memory block size of 64 bytes and starting addresses of the main portions 221(1)-221(3) are aligned to the memory boundaries 250(1)-250(3), respectively. A size of the respective secondary portions 241(1)-241(3) can be selected to be smaller than 64 bytes, for example, 32 bytes.

The frame buffer 231B shows an example when compressed access units 261-263 having various sizes are stored. The compressed access units 261-263 can be stored in their respective main portions 221(1)-221(3) and secondary portions 241(1)-241(3). In the FIG. 2 example, the size of the compressed access unit 261 is less than the 64-byte memory block size. As a result, the compressed access unit 261 can be stored completely within the main portion 221(1), while the secondary portion 241(1) remains empty. The size of the compressed access unit 262 is equal to the 64-byte memory block size. As a result, the compressed access unit 262 can also be stored completely within the main portion 221(2), while the secondary portion 241(2) remains empty. However, the size of the compressed access unit 263 is larger than the 64-byte memory block size. As a result, a first part of the compressed access unit 263 can fill the main portion 221(3) and a second part or a remainder part of the compressed access unit 263 can be stored in the secondary portion 241(3).
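The three storage cases of the FIG. 2 example can be sketched as a single split operation. The function name `split_compressed` is hypothetical, and raising an error when the data exceeds the combined capacity is an assumption of this sketch; the disclosure does not describe that case.

```python
from typing import Tuple


def split_compressed(data: bytes, main_size: int,
                     secondary_size: int) -> Tuple[bytes, bytes]:
    """Split a compressed access unit into (main part, remainder part).

    The remainder is empty when the data fits in the main portion.
    """
    if len(data) > main_size + secondary_size:
        # Assumed behavior: this overflow case is not covered by the text.
        raise ValueError("compressed access unit exceeds main + secondary")
    return data[:main_size], data[main_size:]
```

With a 64-byte main portion and a 32-byte secondary portion, a 40-byte or 64-byte compressed access unit leaves the remainder empty, while an 80-byte compressed access unit places 64 bytes in the main portion and 16 bytes in the secondary portion. The 40-byte and 80-byte figures are illustrative values, not sizes given for the compressed access units 261 and 263.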

In various embodiments, the size of the access unit and the sizes of the main and secondary portion can be selected and remain constant for an input image. On the other hand, multiple input images, such as sequential frames of a video, can be stored by the memory system 100. The size of the access unit and the sizes of the main and secondary portion can be selected for each individual input image, and thus can dynamically vary from one input image to another.

The main and secondary portions can be arranged in the frame buffer 131 according to various layouts. FIGS. 3-5 show exemplary layouts including repeating patterns of the main and secondary portions in a periodic manner, where a smallest repeating unit is a superblock. Accordingly, the main and secondary portions in the frame buffer 131 can be arranged by, for example, sequentially positioning superblocks next to each other. In an embodiment, the size of superblocks can be multiples of the memory block size.

FIG. 3 shows three exemplary superblocks 341A-341C in three frame buffers 331A-331C according to an embodiment of the disclosure. The superblocks 341A-341C share a common feature where all secondary portions in a superblock are collected as a secondary portion group that is located in the middle of the respective superblock. The starting addresses of the main portions are aligned to memory boundaries. The size of the secondary portion group is one or multiples of the memory block size and the first secondary portion of the secondary portion group is aligned to a memory boundary.

Referring to the superblock 341A, the size of the access unit is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main and secondary portion are set to 128 and 32 bytes, respectively. As shown, the superblock pattern has a secondary portion group having four secondary portions (S₀-S₃) that is sandwiched between a first main portion group (M₀-M₁) and a second main portion group (M₂-M₃). The size of the superblock 341A is five times the memory block size (i.e., 640 bytes). Memory boundaries of the superblock 341A are located at 0, 128, 256, 384, 512, and 640 bytes, and the starting addresses of all main portions M₀-M₃ are aligned to the memory boundaries 0, 128, 384, and 512 bytes, respectively. The secondary portion group has a size of 128 bytes and the first secondary portion S₀ is aligned to the memory boundary 256 bytes.

Referring to the superblock 341B, the size of the access unit is set to 192 bytes, the memory block size is set to 128 bytes, and the sizes of the main and secondary portion are set to 128 and 64 bytes, respectively. As shown, the superblock pattern has a secondary portion group having four secondary portions (S₀-S₃) that is sandwiched between a first main portion group (M₀-M₁) and a second main portion group (M₂-M₃). The size of the superblock 341B is six times the memory block size (i.e., 768 bytes). Memory boundaries of the superblock 341B are located at 0, 128, 256, 384, 512, 640, and 768 bytes, and the starting addresses of all main portions M₀-M₃ are aligned to the memory boundaries 0, 128, 512, and 640 bytes, respectively. The secondary portion group has a size of 256 bytes and the first secondary portion S₀ is aligned to the memory boundary 256 bytes.

Referring to the superblock 341C, the size of the access unit is set to 384 bytes, the memory block size is set to 256 bytes, and the sizes of the main and secondary portion are set to 256 and 128 bytes, respectively. As shown, the superblock pattern has a secondary portion group having two secondary portions (S₀-S₁) that is sandwiched between a first main portion M₀ and a second main portion M₁. The size of the superblock 341C is three times the memory block size (i.e., 768 bytes). Memory boundaries of the superblock 341C are located at 0, 256, 512, and 768 bytes, and the starting addresses of the main portions M₀-M₁ are aligned to the memory boundaries 0 and 512 bytes, respectively. The secondary portion group has a size of 256 bytes and the first secondary portion S₀ is aligned to the memory boundary 256 bytes.
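The FIG. 3 pattern, in which the secondary portion group sits between two main portion groups, can be reproduced with a short layout generator. This is a sketch under assumptions: the function name `middle_group_superblock` and the tuple representation are hypothetical, and the mains are assumed to split evenly into two halves as in the three examples above.

```python
from typing import List, Tuple

Portion = Tuple[str, int, int]  # (label, start address, size in bytes)


def middle_group_superblock(main_size: int, secondary_size: int,
                            units: int) -> List[Portion]:
    """Lay out one superblock: first half of the mains, all secondaries,
    then the remaining mains, packed back to back from address 0."""
    if units % 2 != 0:
        raise ValueError("this sketch assumes an even number of access units")
    layout: List[Portion] = []
    address = 0
    half = units // 2
    for i in range(half):
        layout.append((f"M{i}", address, main_size))
        address += main_size
    for i in range(units):
        layout.append((f"S{i}", address, secondary_size))
        address += secondary_size
    for i in range(half, units):
        layout.append((f"M{i}", address, main_size))
        address += main_size
    return layout
```

Calling `middle_group_superblock(128, 32, 4)` places M₀-M₃ at 0, 128, 384, and 512 bytes and S₀ at 256 bytes, matching the superblock 341A, while `middle_group_superblock(256, 128, 2)` places M₁ at 512 bytes, matching the superblock 341C.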

FIG. 4 shows three exemplary superblocks 441A-441C in three frame buffers 431A-431C according to an embodiment of the disclosure. The superblocks 441A-441C share a common feature where all secondary portions are collected as a secondary portion group that follows a main portion group having main portions. The starting addresses of the main portions are aligned to memory boundaries. The size of the secondary portion group is selected to be one or multiples of the memory block size and the first secondary portion in the secondary portion group is aligned to a memory boundary.

Referring to the superblock 441A, the size of the access unit is set to 192 bytes, the memory block size is set to 128 bytes, and the sizes of the main and secondary portion are set to 128 and 64 bytes, respectively. As shown, the superblock pattern 441A has a secondary portion group having two secondary portions (S₀-S₁) that follows a main portion group (M₀-M₁). The size of the superblock 441A is three times the memory block size (i.e., 384 bytes). Memory boundaries of the superblock 441A are located at 0, 128, 256, and 384 bytes, and the starting addresses of all main portions M₀-M₁ are aligned to the memory boundaries 0 and 128 bytes, respectively. The secondary portion group has a size of 128 bytes and the first secondary portion S₀ is aligned to the memory boundary 256 bytes.

Referring to the superblock 441B, the size of the access unit is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main and secondary portion are set to 128 and 32 bytes, respectively. As shown, the superblock pattern has a secondary portion group having four secondary portions (S₀-S₃) that follows a main portion group (M₀-M₃). The size of the superblock 441B is five times the memory block size (i.e., 640 bytes). Memory boundaries of the superblock 441B are located at 0, 128, 256, 384, 512, and 640 bytes, and the starting addresses of all main portions M₀-M₃ are aligned to the memory boundaries 0, 128, 256, and 384 bytes, respectively. The secondary portion group has a size of 128 bytes and the first secondary portion S₀ is aligned to the memory boundary 512 bytes.

Referring to the superblock 441C, the size of the access unit is set to 320 bytes, the memory block size is set to 128 bytes, and the sizes of the main and secondary portion are set to be 256 and 64 bytes, respectively. As shown, the superblock pattern has a secondary portion group having two secondary portions (S₀-S₁) that follows a main portion group (M₀-M₁). The size of the superblock 441C is five times the memory block size (i.e., 640 bytes). Memory boundaries of the superblock 441C are located at 0, 128, 256, 384, 512, and 640 bytes, and the starting addresses of all main portions M₀-M₁ are aligned to the memory boundaries 0 and 256 bytes, respectively. The secondary portion group has a size of 128 bytes, and the first secondary portion S₀ is aligned to the memory boundary 512 bytes.
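When superblocks of the FIG. 4 kind are positioned one after another, the addresses of the main and secondary portions of a given access unit can be computed directly. The sketch below is illustrative: the function name `portion_addresses` is hypothetical, and it assumes that access units are assigned to superblocks in index order and that the frame buffer starts at `base`.

```python
from typing import Tuple


def portion_addresses(index: int, main_size: int, secondary_size: int,
                      units_per_superblock: int,
                      base: int = 0) -> Tuple[int, int]:
    """Return (main address, secondary address) of access unit `index`
    for superblocks whose secondary group follows the main group."""
    superblock_size = units_per_superblock * (main_size + secondary_size)
    superblock, slot = divmod(index, units_per_superblock)
    start = base + superblock * superblock_size
    main_addr = start + slot * main_size
    secondary_addr = (start + units_per_superblock * main_size
                      + slot * secondary_size)
    return main_addr, secondary_addr
```

Using the superblock 441B parameters (128-byte main portions, 32-byte secondary portions, four access units per superblock), access unit 0 maps to addresses 0 and 512 bytes, and access unit 5, the second unit of the next superblock, maps to 768 and 1184 bytes. The main address remains a multiple of the 128-byte memory block size because the superblock size is itself a multiple of that size.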

FIG. 5 shows two exemplary superblocks 541A-541B in two frame buffers 531A-531B according to an embodiment of the disclosure. The superblocks 541A-541B share a common feature where the size of the main portion (128 bytes) is smaller than the memory block size (256 bytes). In addition, some compressed access units may need to be stored in both the main and secondary portions, while some compressed access units can be stored completely in the main portions. To allow for efficient access to the compressed access units stored in the main and secondary portions, as many secondary portions as possible can be located in the same memory blocks as their respective main portions, and preferably immediately follow the respective main portions. For example, the main portions M₀ and M₃ are followed by their respective secondary portions S₀ and S₃ in their respective memory blocks in the superblock 541A.

Referring to the superblock 541A, the size of the access unit is set to 192 bytes, the memory block size is set to 256 bytes, and the sizes of the main and secondary portion are set to 128 and 64 bytes, respectively. As shown, the superblock pattern has a secondary portion S₀ following the respective main portion M₀ and a secondary portion S₃ following the respective main portion M₃. The size of the superblock 541A is three times the memory block size (i.e., 768 bytes). Memory boundaries of the superblock 541A are located at 0, 256, 512, and 768 bytes, and the starting addresses of three main portions M₀, M₁, and M₃ are aligned to the memory boundaries 0, 256, and 512 bytes, respectively. The starting address of the main portion M₂ is not aligned to a memory boundary, but M₂ is within a single memory block between 256 and 512 bytes.

Referring to the superblock 541B, the size of the access unit is set to 192 bytes, the memory block size is set to 256 bytes, and the sizes of the main and secondary portion are set to 128 and 64 bytes, respectively. As shown, the superblock pattern has a secondary portion S₀ following the respective main portion M₀, and a secondary portion S₁ following the respective main portion M₁. The size of the superblock 541B is three times the memory block size (i.e., 768 bytes). Memory boundaries of the superblock 541B are located at 0, 256, 512, and 768 bytes, and the starting addresses of three main portions M₀, M₁, and M₂ are aligned to the memory boundaries 0, 256, and 512 bytes, respectively. The starting address of the main portion M₃ is not aligned to a memory boundary, but M₃ is within a single memory block between 512 and 768 bytes.
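The property that every portion stays within a single memory block, even when its starting address is not aligned, can be checked mechanically. In the sketch below, the names `within_one_block` and `LAYOUT_541B` are hypothetical. The positions of M₀, S₀, M₁, S₁, and M₂ follow the description of the superblock 541B, the position of M₃ is inferred from the 128-byte size of M₂, and the positions of S₂ and S₃ are assumptions because the text does not state them.

```python
from typing import Dict, Tuple


def within_one_block(start: int, size: int, block_size: int) -> bool:
    """True when [start, start + size) does not cross a memory boundary."""
    return start // block_size == (start + size - 1) // block_size


# label -> (start address, size in bytes) for one 768-byte superblock
LAYOUT_541B: Dict[str, Tuple[int, int]] = {
    "M0": (0, 128),
    "S0": (128, 64),
    "S2": (192, 64),   # assumed position
    "M1": (256, 128),
    "S1": (384, 64),
    "S3": (448, 64),   # assumed position
    "M2": (512, 128),
    "M3": (640, 128),  # inferred: directly after M2 in the same block
}
```

With a 256-byte memory block size, `within_one_block` holds for every entry, including M₃, whose starting address of 640 bytes is not a multiple of 256.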

FIG. 6 shows alternative frame buffer examples according to embodiments of the disclosure. The main and secondary portions in the frame buffers 631A and 631B can be arranged by having two groups, a main portion group and a secondary portion group, wherein the main portion group includes all the main portions sequentially positioned next to each other while the secondary portion group includes all the secondary portions sequentially positioned next to each other. In an embodiment, the size of the main portion is one or multiples of the memory block size, and the starting addresses of the main portions can be aligned to memory boundaries. The main portion group can be positioned adjacent to the secondary portion group or can be separated from the secondary portion group.

Referring to the frame buffer 631A, the size of the access unit is set to 80 bytes, the memory block size is set to 64 bytes, and the sizes of the main and secondary portion are set to 64 and 16 bytes, respectively. The main portion group includes all the main portions. As shown, the main portions can be positioned next to each other and have starting addresses aligned to consecutive memory boundaries at 0, 64, 128, 192, 256 bytes, and so on. The secondary portion group includes all the secondary portions. The first secondary portion S₀ can have a starting address aligned to a memory boundary such as 512 bytes.

Referring to the frame buffer 631B, the size of the access unit is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main and secondary portion are set to 128 and 32 bytes, respectively. The main portion group includes all the main portions. As shown, the main portions can be positioned next to each other and have starting addresses aligned to consecutive memory boundaries at 0, 128, 256, 384, 512 bytes, and so on. The secondary portion group includes all the secondary portions. The first secondary portion S₀ can have a starting address aligned to a memory boundary such as 4096 bytes.
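For the grouped layouts of FIG. 6, the starting address of each portion follows from simple arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, the main portion group is assumed to start at address 0, and the secondary portions are assumed to be packed back to back from the starting address of S₀.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupedLayout:
    """Main portion group followed (possibly after a gap) by a secondary group."""
    main_size: int        # bytes per main portion
    secondary_size: int   # bytes per secondary portion
    secondary_base: int   # starting address of S0
    main_base: int = 0    # starting address of M0 (assumed 0 here)

    def main_address(self, index):
        """Starting address of main portion M<index>."""
        return self.main_base + index * self.main_size

    def secondary_address(self, index):
        """Starting address of secondary portion S<index>."""
        return self.secondary_base + index * self.secondary_size

    def max_units_before_secondary(self):
        """How many main portions fit before the secondary group begins."""
        return (self.secondary_base - self.main_base) // self.main_size


LAYOUT_631A = GroupedLayout(main_size=64, secondary_size=16, secondary_base=512)
LAYOUT_631B = GroupedLayout(main_size=128, secondary_size=32, secondary_base=4096)

for label, layout in (("631A", LAYOUT_631A), ("631B", LAYOUT_631B)):
    mains = [layout.main_address(i) for i in range(5)]
    seconds = [layout.secondary_address(i) for i in range(3)]
    print(label, "main:", mains, "secondary:", seconds)
```

Under these assumptions, every main portion starts on a memory boundary, while only some secondary portions do, because each secondary portion is a fraction of the memory block size.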

In the superblocks and frame buffers shown in FIGS. 3-6, at least one of the secondary portions is not sequentially located after its respective main portion.

In an embodiment, starting addresses of superblocks and frame buffers can be aligned to memory boundaries of the memory 130, such as 0 bytes shown in FIGS. 3-6.

While exemplary superblocks and frame buffers are shown in FIGS. 3-6, it should be understood that variations, such as variations of superblock patterns, the positioning of superblocks in frame buffers, and the like, are possible in order to satisfy different memory usage situations.

During operation, when the main and secondary portions are positioned in the frame buffer 131 according to a layout, such as those shown in FIGS. 3 to 6, compressed access units can be stored in the respective main portions. For example, a compressed access unit can be stored in the respective main portion and, if necessary, the respective secondary portion. When the size of the compressed access unit is equal to or smaller than that of the respective main portion, the compressed access unit can be completely stored inside the respective main portion and the corresponding secondary portion can remain empty.

FIG. 7 shows a flow chart outlining an exemplary process 700 according to an embodiment of the disclosure. In an example, the process 700 is executed by the memory system 100 in FIG. 1. The process starts at S701 and proceeds to S710.

At S710, an input image is divided into one or more access units, such as an N×M array of access units as shown in FIG. 2. In an example, the memory distribution device 110 is configured to divide the input image into an array of access units. The input image can be a video frame, a photographic picture, a graphic art, an animated picture, and the like. For example, a video frame can be a reference frame used by a video codec 146. The process then proceeds to S720.
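As an illustration of S710, the following sketch divides an image into a grid of rectangular access units. The description does not fix the shape of an access unit, so the rectangular tiling, the one-byte-per-pixel representation, and the function name are all assumptions made for this example.

```python
def divide_into_access_units(image_rows, unit_height, unit_width):
    """Split an image into an N x M grid of access units.

    image_rows is a list of equal-length bytes objects, one per pixel row
    (one byte per pixel is assumed). Each access unit is returned as the
    bytes of its rows concatenated top to bottom.
    """
    height = len(image_rows)
    width = len(image_rows[0])
    if height % unit_height or width % unit_width:
        raise ValueError("image size must be a multiple of the access unit size")

    grid = []
    for top in range(0, height, unit_height):
        grid_row = []
        for left in range(0, width, unit_width):
            unit = b"".join(
                image_rows[y][left:left + unit_width]
                for y in range(top, top + unit_height)
            )
            grid_row.append(unit)
        grid.append(grid_row)
    return grid


# A 4 x 8 test image split into 2 x 4 access units gives a 2 x 2 grid.
image = [bytes(range(row * 8, row * 8 + 8)) for row in range(4)]
units = divide_into_access_units(image, unit_height=2, unit_width=4)
print(len(units), "x", len(units[0]), "access units of", len(units[0][0]), "bytes")
```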

At S720, a frame buffer is allocated in a memory. In an example, the memory distribution device 110 is configured to allocate the frame buffer 131 in the memory 130. The size of the frame buffer is equal to or larger than a size of the input image. In one embodiment, the starting address of the frame buffer can be aligned to a memory boundary, such as 0 bytes.

At S730, two memory portions, a main portion and a secondary portion, are allocated for each access unit in the frame buffer. In an example, the memory distribution device is configured to allocate a main portion and a secondary portion for each access unit in the frame buffer 131. In an embodiment, the summation of a size of the main portion and a size of the secondary portion can be equal to a size of the access unit and, for example, it can be a size of an uncompressed access unit. In an embodiment, the size of the main portion can depend on the compressibility of the input image, the compression method, the memory block size, and the like. In addition, in an embodiment, a ratio of the size of the main portion over the size of the secondary portion can depend on the compressibility of the input image, the compression method, and the like. For example, when the access units can be compressed to smaller sizes, smaller main portions can be sufficient to store the compressed access units and the respective secondary portions can remain empty, resulting in a smaller ratio of the size of the main portion over the size of the secondary portion.
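The size relation in S730 can be illustrated with the values used in FIGS. 5 and 6. This is a minimal sketch assuming the embodiment in which the main and secondary sizes sum to the size of the uncompressed access unit; the function name is hypothetical.

```python
def secondary_size(access_unit_size, main_size):
    """Secondary portion size when main + secondary equals the access unit size."""
    if not 0 < main_size < access_unit_size:
        raise ValueError("main portion must be smaller than the access unit")
    return access_unit_size - main_size


# (access unit size, main portion size) in bytes, taken from the description.
EXAMPLES = {
    "superblocks 541A-541B": (192, 128),
    "frame buffer 631A": (80, 64),
    "frame buffer 631B": (160, 128),
}

for label, (unit, main) in EXAMPLES.items():
    sec = secondary_size(unit, main)
    print(f"{label}: main={main} B, secondary={sec} B, main/secondary={main / sec:g}")
```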

Further, in an embodiment, the main portions can have starting addresses that are aligned to memory boundaries and sizes that are one or multiples of the memory block size so that the data stored in the main portions can be efficiently accessed.

Alternatively, when the size of the main portion is smaller than the memory block size, each main portion can be located within the respective memory block while one or more main portions can have starting addresses aligned to one or more memory boundaries.

In an embodiment, the size of the secondary portion can be a fraction of the memory block size. As a result, two or more secondary portions can be grouped together as one or more secondary portion groups and stored separately from their respective main portions.

Further, in an embodiment, the first secondary portion in each respective secondary portion group can have starting address aligned to a memory boundary.

In another embodiment, at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer.

The main and secondary portions can be arranged in the frame buffer, such as the frame buffer 131, according to various layouts. In one embodiment, the layouts can include repeating patterns of superblocks where a superblock is a smallest repeating unit in the frame buffer. Accordingly, the main and secondary portions in the frame buffer can be arranged by, for example, sequentially positioning superblocks next to each other. The size of superblocks can be set to multiples of the memory block size.
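Because a superblock repeats, the address of a portion can be computed from the superblock index and the offset of the portion inside the pattern. The sketch below uses the main portion offsets of the superblock 541A (0, 256, 384, and 512 bytes within a 768-byte superblock) as an example. It assumes that access units are assigned to superblocks in order, four per superblock, and the function name is hypothetical.

```python
SUPERBLOCK_SIZE = 768                  # three 256-byte memory blocks
MAIN_OFFSETS_541A = (0, 256, 384, 512) # offsets of M0..M3 inside superblock 541A


def main_portion_address(unit_index, frame_buffer_base=0,
                         offsets=MAIN_OFFSETS_541A,
                         superblock_size=SUPERBLOCK_SIZE):
    """Starting address of the main portion for the given access unit.

    Assumes consecutive access units fill consecutive superblocks.
    """
    superblock_index, slot = divmod(unit_index, len(offsets))
    return frame_buffer_base + superblock_index * superblock_size + offsets[slot]


print([main_portion_address(i) for i in range(8)])
```

The same lookup applies to secondary portions once their offsets within the pattern are known.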

In an embodiment, the starting addresses of the main portions in the superblock are aligned to memory boundaries. The secondary portions in the superblock can be grouped into one or more secondary portion groups having a size being multiples of the memory block size. The first secondary portion in each secondary portion group can be aligned to a memory boundary. Some exemplary superblocks having the above features are shown in FIGS. 3 and 4.

In another embodiment, a superblock can have one or more main portions having a size smaller than the memory block size. Some exemplary superblocks are shown by FIG. 5. For example, as many as possible secondary portions immediately follow the respective main portions (such as S₀ follows M₀ and S₃ follows M₃ in 541A of FIG. 5). In another example, each main portion can be located completely within the same memory block.

In one embodiment, a layout does not include repeating patterns of superblocks. Instead, the main and secondary portions in the frame buffers can be arranged by having a main portion group and a secondary portion group, such as the examples shown by FIG. 6. For example, the main portion group includes the main portions having starting addresses aligned to consecutive memory boundaries. The secondary portion group includes the secondary portions positioned next to each other. The first secondary portion of the secondary portion group can be aligned to a memory boundary.

At S740, an access unit can be compressed into a compressed access unit to reduce the bandwidth requirement of data transfer between the memory and another device accessing the memory. For example, in the memory system 100, the memory 130 can be located on a different chip than the memory distribution device 110, and the memory distribution device 110 is configured to compress the access units to reduce the large bandwidth requirement of data transfer between the memory 130 and the memory distribution device 110. Both lossless and lossy compression methods can be used to compress the access units. Lossless compression methods can preserve the quality of the initial data while lossy compression methods can achieve more compression. The compression methods can be generic compression methods, or image compression methods, or video compression methods, or the like. For example, compression methods can include run-length encoding, dictionary-based algorithms, Huffman coding, deflation, chroma subsampling, discrete cosine transform, or the like.
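As one concrete instance of the lossless methods listed above, the following sketch shows a simple byte-oriented run-length encoder and decoder. The (count, value) pair format is an assumption chosen for brevity and is not necessarily the format used by any particular embodiment.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (run length, value) byte pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


flat_unit = bytes(192)          # a uniform 192-byte access unit
noisy_unit = bytes(range(192))  # no repeated neighbouring bytes

print(len(rle_encode(flat_unit)), len(rle_encode(noisy_unit)))
```

The uniform unit shrinks to a single pair, while the unit without runs grows beyond its original size. A compressor that can expand its input may therefore need a fallback, such as storing the access unit uncompressed, so that the result still fits in the main and secondary portions; that fallback is not described in the text and is mentioned here only as a design consideration.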

At S750, the size of the compressed access unit can be compared with the size of the main portion. In an example, the memory distribution device 110 is configured to compare the sizes of the compressed access unit and the main portion. If the size of the compressed access unit is larger than the size of the main portion, the process then proceeds to S770. Otherwise, the process then proceeds to S760.

At S760, the compressed access unit can be stored completely in the respective main portion because the size of the compressed access unit is less than or equal to the size of the main portion. In an example, the memory controller 120 is configured to store, in response to instructions of the memory distribution device 110, the compressed access unit into its respective main portion.

When the size of the compressed access unit is larger than the size of the main portion, the process proceeds to S770. At S770, a first part of the compressed access unit can be stored in the respective main portion. The first part of the compressed access unit can have the same size as the main portion and fill the respective main portion. In an example, the memory controller 120 is configured to store, in response to instructions of the memory distribution device 110, the first part of the compressed access unit into its respective main portion.

At S780, a second part, or the remainder part, of the compressed access unit can be stored in the respective secondary portion. Therefore, the compressed access unit can be stored separately in the respective main and secondary portion when the size of the compressed access unit is larger than the size of the main portion. In an example, the memory controller 120 is configured to store, in response to instructions of the memory distribution device 110, the remainder part of the compressed access unit into its respective secondary portion.

S740-S780 can be repeatedly performed on all the access units before the process proceeds to S799, where the process terminates. In an example, the memory distribution device 110 and the memory controller 120 are configured to perform S740-S780 repeatedly for all the access units in the input image.
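The loop over S740-S780 can be sketched as follows, using the grouped layout of the frame buffer 631A (64-byte main portions starting at address 0 and 16-byte secondary portions starting at address 512). This is a minimal sketch, not the disclosed implementation: the function name is hypothetical, `zlib.compress` stands in for whichever compression method is used, and the list of compressed sizes returned to the caller is an assumption, since the description does not state how a reader learns the compressed size.

```python
import zlib


def store_access_units(units, main_size, secondary_size, secondary_base,
                       compress=zlib.compress):
    """Compress each access unit and place it in its main/secondary portions."""
    if len(units) * main_size > secondary_base:
        raise ValueError("main portion group would overlap the secondary group")

    frame_buffer = bytearray(secondary_base + len(units) * secondary_size)
    compressed_sizes = []

    for index, unit in enumerate(units):
        compressed = compress(unit)                       # S740
        if len(compressed) > main_size + secondary_size:
            raise ValueError("compressed unit does not fit its two portions")

        main_at = index * main_size
        if len(compressed) <= main_size:                  # S750 -> S760
            frame_buffer[main_at:main_at + len(compressed)] = compressed
        else:                                             # S750 -> S770
            frame_buffer[main_at:main_at + main_size] = compressed[:main_size]
            remainder = compressed[main_size:]            # S780
            sec_at = secondary_base + index * secondary_size
            frame_buffer[sec_at:sec_at + len(remainder)] = remainder

        compressed_sizes.append(len(compressed))

    return frame_buffer, compressed_sizes


# Two uniform 80-byte access units compress well and stay in their main portions.
buffer_631a, sizes = store_access_units(
    [bytes(80), b"\xff" * 80], main_size=64, secondary_size=16, secondary_base=512
)
print(sizes)
```

When a compressed access unit fits in its main portion, the corresponding secondary portion is left untouched, which matches the case where the secondary portion remains empty.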

In various embodiments, the size of the access unit and the sizes of the main and secondary portion can be selected and remain constant for an input image. On the other hand, multiple input images, such as sequential frames of a video, can be stored by a memory. The size of the access unit and the sizes of the main and secondary portion can be selected for each individual input image, and thus can dynamically vary from one input image to another.

In various examples, the memory distribution device 110 or functions of the memory distribution device 110 can be implemented with hardware, software, or a combination thereof. In one example, the memory distribution device 110 is implemented in hardware, such as a processing circuit. The hardware may comprise one or more of discrete components, an integrated circuit, an application-specific integrated circuit (ASIC), and the like. In another example, functions of memory distribution can be implemented with software or firmware including instructions stored in a computer-readable non-volatile storage medium. The instructions, when executed by a processing circuit, cause the processing circuit to perform the respective functions.

While aspects of the present disclosure have been described in conjunction with the specific embodiments thereof that are proposed as examples, alternatives, modifications, and variations to the examples may be made. Accordingly, embodiments as set forth herein are intended to be illustrative and not limiting. There are changes that may be made without departing from the scope of the claims set forth below. 

What is claimed is:
 1. A method for storing an input image into a memory, comprising: allocating one or more frame buffers in the memory; dividing the input image into access units corresponding to subsets of the input image and allocating a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer; compressing the access units into compressed access units; and storing each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then storing a remainder of the compressed access unit into its respective secondary portion.
 2. The method of claim 1, wherein the memory has a sequence of memory blocks separated by memory boundaries at addresses that are multiples of a memory block size and the memory block size is determined based on characteristics of the memory and devices accessing the memory.
 3. The method of claim 2, wherein a size of the main portions is one or multiples of the memory block size and starting addresses of the main portions are aligned to memory boundaries.
 4. The method of claim 2, wherein a size of the main portions is a fraction of the memory block size and each main portion is located within a respective memory block.
 5. The method of claim 2, wherein a size of the secondary portions is a fraction of the memory block size and secondary portions are grouped into one or more secondary portion groups.
 6. The method of claim 5, wherein the one or more secondary portion groups have a size being one or multiples of the memory block size and the first secondary portion of each secondary portion group has a starting address aligned to a memory boundary.
 7. The method of claim 2, wherein a plurality of main portions and secondary portions are arranged in a predetermined pattern to form a superblock having a size that is one or multiples of the memory block size and the main and secondary portions in the frame buffer are arranged by sequentially positioning a plurality of superblocks adjacent to each other.
 8. The method of claim 2, wherein the memory block size is selected to be 32, 64, 128, 256, 512, 1K, 2K, or 4K bytes.
 9. The method of claim 1, wherein the input image is a still image or a video frame.
 10. A device for storing an input image, comprising: a memory configured to store one or more frame buffers; a memory distribution device configured to receive the input image, allocate a frame buffer in the memory to store the input image, divide the input image into access units corresponding to subsets of the input image, and allocate a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer; and a memory controller configured to store, in response to instructions of the memory distribution device, each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then store a remainder of the compressed access unit into its respective secondary portion.
 11. The device of claim 10, wherein the memory has a sequence of memory blocks separated by memory boundaries at addresses that are multiples of a memory block size and the memory block size is determined based on characteristics of the memory and devices accessing the memory including the memory distribution device.
 12. The device of claim 11, wherein the memory distribution device is configured to select a size of the main portions to be one or multiples of the memory block size and align starting addresses of the main portions to memory boundaries.
 13. The device of claim 11, wherein the memory distribution device is configured to select a size of the main portions to be a fraction of the memory block size and position each main portion within a respective memory block.
 14. The device of claim 11, wherein the memory distribution device is configured to select a size of the secondary portions to be a fraction of the memory block size and group secondary portions into one or more secondary portion groups.
 15. The device of claim 14, wherein the memory distribution device is configured to select a size of the one or more secondary portion groups to be one or multiples of the memory block size and align the starting address of the first secondary portion of each secondary portion group to a memory boundary.
 16. The device of claim 11, wherein the memory distribution device is configured to arrange a plurality of main portions and secondary portions in a predetermined pattern to form a superblock having a size being one or multiples of the memory block size and further position a plurality of superblocks adjacent to each other in the frame buffer.
 17. The device of claim 10, wherein the memory is located on a different integrated chip than the memory distribution device.
 18. The device of claim 10, wherein the memory distribution device is integrated into a video codec.
 19. The device of claim wherein the memory distribution device is configured to determine the memory block size to be 32, 64, 128, 256, 512, 1K, 2K, or 4K bytes.
 20. A non-transitory computer readable medium having computer readable instructions stored thereon which, when executed by a processing circuit, cause the processing circuit to perform a method, the method comprising: allocating one or more frame buffers in the memory; dividing the input image into access units corresponding to subsets of the input image and allocating a main portion and a secondary portion in the frame buffer for each of the access units, wherein at least one of the secondary portions is not sequentially located after its respective main portion within the frame buffer; compressing the access units into compressed access units; and storing each of the compressed access units into its respective main portion, and if a size of the compressed access unit exceeds a size of the main portion, then storing a remainder of the compressed access unit into its respective secondary portion.